A Bayesian network model for protein fold and remote homologue recognition

Authors

  • A. Raval
  • Zoubin Ghahramani
  • David L. Wild
Abstract

MOTIVATION: The Bayesian network approach is a framework that combines graphical representation with probability theory and includes hidden Markov models as a special case. Hidden Markov models trained on amino acid sequence or secondary structure data alone have shown potential for addressing the problem of protein fold and superfamily classification.

RESULTS: This paper describes a novel implementation of a Bayesian network that simultaneously learns amino acid sequence, secondary structure and residue accessibility for proteins of known three-dimensional structure. Awareness of the errors inherent in predicted secondary structure may be incorporated into the model by means of a confusion matrix. Training and validation data have been derived for a number of protein superfamilies from the Structural Classification of Proteins (SCOP) database. Cross-validation results using posterior probability classification demonstrate that the Bayesian network classifies proteins of known structural superfamily better than a hidden Markov model trained on amino acid sequences alone.
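Posterior probability classification can be illustrated with a minimal sketch, assuming one trained model per superfamily that returns a log-likelihood for a query protein. The function names, superfamily labels, scores, and the uniform prior are hypothetical choices made for illustration; they are not taken from the paper.

```python
import math


def posterior_over_classes(log_likelihoods, log_priors=None):
    """Return P(class | protein) for each class via Bayes' rule.

    log_likelihoods maps a class label to log P(protein | class model).
    """
    classes = list(log_likelihoods)
    if log_priors is None:
        # Assumption: uniform prior over the candidate superfamilies.
        log_priors = {c: -math.log(len(classes)) for c in classes}

    # Unnormalised log posterior: log-likelihood plus log prior.
    joint = {c: log_likelihoods[c] + log_priors[c] for c in classes}

    # Log-sum-exp normalisation avoids underflow for very negative scores.
    peak = max(joint.values())
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in joint.values()))
    return {c: math.exp(v - log_norm) for c, v in joint.items()}


def classify(log_likelihoods, log_priors=None):
    """Pick the class with the highest posterior probability."""
    posterior = posterior_over_classes(log_likelihoods, log_priors)
    best = max(posterior, key=posterior.get)
    return best, posterior[best]


# Hypothetical log-likelihoods from three per-superfamily models.
scores = {
    "globin-like": -310.2,
    "ferredoxin-like": -325.7,
    "TIM barrel": -318.9,
}
label, prob = classify(scores)
print(label, round(prob, 4))
```

Working in log space matters here because likelihoods of whole protein sequences are typically far too small to represent directly as floating-point probabilities.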

Similar resources

Targeted Deletion of Los1 Homologue Affects the Production of a Recombinant Model Protein in Pichia pastoris

Background: The methylotrophic yeast Pichia pastoris is an appealing production host for a variety of recombinant proteins, including biologics. In this sense, various genetic- and non-genetic-based techniques have been implemented to improve the production efficiency of this expression platform. Los1 (loss of suppression) encodes a non-essential nuclear tRNA exporter in Saccharomyces cerevisiae, which i...

Risk Analysis of Operating Room Using the Fuzzy Bayesian Network Model

To enhance patient safety, we need effective methods for risk management. This work aims to propose an integrated approach to risk management for a hospital system. To improve patient safety, we should develop flexible methods where different aspects of risk and types of information are taken into consideration. This paper proposes a fuzzy Bayesian network to model and analyze risk in the op...

Using fold recognition to search for useful proteins: Bayesian approach to fold recognition

The wealth of protein sequence and structure data is greater than ever, thanks to the ongoing Genomics and Structural Genomics projects. The information available through such efforts needs to be analysed by new methods that combine both databases. One important result of genomic sequence analysis is the inference of functional homology among proteins. Until recently sequence similarity compari...

FALCON@home: a high-throughput protein structure prediction server based on remote homologue recognition

SUMMARY Protein structure prediction approaches can be categorized into template-based modeling (including homology modeling and threading) and free modeling. However, the existing threading tools perform poorly on remote homologous proteins. Thus, improving fold recognition for remote homologous proteins remains a challenge. In addition, proteome-wide structure prediction poses another cha...

Journal title:
  • Bioinformatics

Volume 18, Issue 6

Pages: -

Publication date: 2002